Amendment to the Claims: 

This listing of claims will replace all prior versions and listings of claims in the 
application: 



Listing of Claims: 

1. (original) A method of determining a statistical model for predicting disease risk 
for a member of a population, comprising: 

a. collecting a plurality of sets of data, each of said sets of data associated with 
one member of said population, and comprising data of a first type, data of a 
second type, and an indicator of disease status of said one member associated 
with said set; 

b. selecting a candidate statistical model for calculating said disease risk as a 
function of data of said first type, said candidate model dependent on a 
plurality of parameters; 

c. determining a plurality of weights, each one of said weights associated with 
one of said sets of data and indicating a statistical significance of said one of 
said sets of data, wherein weights associated with sets of said data having like 
data of said second type are the same; and 

d. optimizing said parameters of said candidate model by fitting said plurality of 
sets of data to said candidate model, taking into account said weights. 
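
By way of non-limiting illustration only, steps (c) and (d) of claim 1 might be sketched as follows. The sketch assumes a logistic candidate model with a single first-type variable, fitted by gradient ascent on a weighted log-likelihood; the record fields (`first`, `second`, `status`), the function name, and the weight values are hypothetical and are not taken from the claims.

```python
import math

def fit_weighted_logistic(records, weights, steps=2000, lr=0.1):
    """Step (d), sketched: maximise the weighted log-likelihood of
    risk = sigmoid(b0 + b1 * first) by plain gradient ascent."""
    b0 = b1 = 0.0
    total = sum(weights)
    for _ in range(steps):
        g0 = g1 = 0.0
        for r, w in zip(records, weights):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * r["first"])))
            err = w * (r["status"] - p)   # each record contributes in proportion to its weight
            g0 += err
            g1 += err * r["first"]
        b0 += lr * g0 / total
        b1 += lr * g1 / total
    return b0, b1

# Hypothetical data: "first" is first-type data, "second" is second-type data.
records = [
    {"first": 0.0, "second": "AA", "status": 0},
    {"first": 1.0, "second": "AA", "status": 0},
    {"first": 2.0, "second": "AA", "status": 1},
    {"first": 3.0, "second": "AA", "status": 1},
    {"first": 0.5, "second": "Aa", "status": 1},
    {"first": 2.5, "second": "Aa", "status": 0},
]

# Step (c), sketched: sets with like second-type data share the same weight.
weight_by_second = {"AA": 1.0, "Aa": 0.3}   # illustrative values only
weights = [weight_by_second[r["second"]] for r in records]

b0, b1 = fit_weighted_logistic(records, weights)
```

In this sketch the down-weighted "Aa" records still influence the fitted parameters, but less than the "AA" records do.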

2. (original) The method of claim 1, wherein data of said first type is non-genetic 
data and data of said second type is genetic data. 

3. (original) The method of claim 1, wherein said corresponding weights are used to 
assess a goodness of said fitting. 

4. (original) The method of claim 1, wherein said determining comprises: 
a. grouping said collected data into groups such that all sets of data within each 
said group have like data of said second type, one of said groups being a 
reference group which contains sets of data having data of said second type 
like data of said second type obtained from said member of said population; 
and 

b. determining a group weight for each said group, whereby said group weight is 
the corresponding weight for each set of data within said each group. 

5. (original) The method of claim 4, wherein the group weight of said reference 
group has a value of one and each of the other group weights has a value between 
zero and one. 
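
As a non-limiting sketch of the grouping of claim 4 and the weight range of claim 5, the following assumes that second-type data can be compared for equality and that the non-reference group weights are supplied by the caller. All names and values here are hypothetical.

```python
from collections import defaultdict

def group_by_second_type(records):
    """Group sets of data so that every set in a group has like second-type data."""
    groups = defaultdict(list)
    for record in records:
        groups[record["second"]].append(record)
    return dict(groups)

def assign_group_weights(groups, reference_key, other_weights):
    """Reference group gets weight one; every other group gets a weight in [0, 1]."""
    weights = {}
    for key in groups:
        if key == reference_key:
            weights[key] = 1.0
            continue
        w = other_weights[key]
        if not 0.0 <= w <= 1.0:
            raise ValueError(f"weight for group {key!r} must lie between zero and one")
        weights[key] = w
    return weights

records = [
    {"id": 1, "second": "AA"},
    {"id": 2, "second": "Aa"},
    {"id": 3, "second": "AA"},
    {"id": 4, "second": "aa"},
]
target_member_second = "AA"   # second-type data of the member whose risk is sought

groups = group_by_second_type(records)
group_weights = assign_group_weights(
    groups, target_member_second, {"Aa": 0.6, "aa": 0.2}
)
# The group weight is the corresponding weight of each set of data in the group.
record_weights = [group_weights[r["second"]] for r in records]
```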

6. (original) The method of claim 5, wherein said other group weights are optimized 
by minimizing a target function, said target function dependent on a plurality of 
residuals, one of said residuals for each of the data sets in said reference group. 

7. (original) The method of claim 6, wherein a residual for the ith one of said data 
sets is the difference between the value of the indicator of disease status contained 
in said ith data set and the value of disease risk for the member associated with 
said ith data set, said value of disease risk calculated from said candidate model 
with said parameters optimized for a given set of group weights by fitting data sets 
in groups other than the reference group to said candidate model. 
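
The optimization of claims 6 and 7 might be sketched as follows, purely for illustration. The sketch makes several assumptions that are not in the claims: the candidate model is a toy one-parameter model (a weighted mean of disease status), the target function is taken to be a sum of squared residuals, and the group weights are searched over a coarse grid.

```python
from itertools import product

def fit_constant_risk(groups, group_weights):
    """Toy candidate model: risk is the weighted mean disease status of the
    non-reference groups, for a given set of group weights."""
    num = den = 0.0
    for key, statuses in groups.items():
        w = group_weights[key]
        num += w * sum(statuses)
        den += w * len(statuses)
    return num / den

def residuals(reference_statuses, risk):
    """One residual per reference-group data set: disease status minus modelled risk."""
    return [y - risk for y in reference_statuses]

def target(res):
    """Assumed target function: sum of squared residuals."""
    return sum(r * r for r in res)

reference = [1, 0, 0, 0]                            # disease status in the reference group
others = {"B": [0, 0, 0, 1], "C": [1, 1, 1, 1]}     # disease status in the other groups

def score(ws):
    risk = fit_constant_risk(others, {"B": ws[0], "C": ws[1]})
    return target(residuals(reference, risk))

grid = [k / 10 for k in range(1, 11)]               # candidate weights 0.1 ... 1.0
best = min(product(grid, grid), key=score)
```

With these hypothetical data, group "B" resembles the reference group and receives the largest weight on the grid, while group "C" receives the smallest.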

8. (original) The method of claim 7, wherein said target function is of the form: 

where 

$w_i$ is the corresponding weight for data set i; and 
$r_i$ is the residual for data set i. 

9. (original) The method of claim 1, wherein data of said first type comprises data 
indicative of time. 
10. (original) The method of claim 9, wherein said candidate model is a Cox 
proportional hazard regression model. 

11. (original) The method of claim 9, wherein said candidate model is a disease risk 
function of the form: 

$$R(t) = 1 - \exp\left\{-\int_0^t h(u)\,du\right\}$$

where 

R(t) represents said disease risk at a given time t; 
h(u) is of the form: 

$$h(u) = h_0(u)\exp\left(\sum_{i=1}^{n_c} \beta_i x_i\right);$$

$h_0(u)$ is dependent only on u; 

$x_i$ is a variable indicative of a disease risk factor, said collected data containing 
a plurality of values of $x_i$; 
$\beta_i$ is a coefficient for $x_i$; and 

$n_c$ is the number of coefficients in said disease risk function. 
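
For illustration only, the disease risk function of claim 11 might be evaluated numerically as sketched below. The sketch assumes that the integral of h(u) runs from 0 to t and uses trapezoidal integration; the constant baseline hazard, the coefficient values, and the function names are hypothetical.

```python
import math

def hazard(u, x, beta, h0):
    """h(u) = h0(u) * exp(sum of beta_i * x_i)."""
    return h0(u) * math.exp(sum(b * xi for b, xi in zip(beta, x)))

def disease_risk(t, x, beta, h0, steps=1000):
    """R(t) = 1 - exp(-integral of h(u) du), integrated from 0 to t (assumed limits)."""
    du = t / steps
    total = 0.0
    for k in range(steps):
        a = k * du
        total += 0.5 * (hazard(a, x, beta, h0) + hazard(a + du, x, beta, h0)) * du
    return 1.0 - math.exp(-total)

# Hypothetical inputs: two risk factors and a constant baseline hazard of 0.01.
x = [1.0, 0.0]
beta = [0.5, 0.2]
risk = disease_risk(10.0, x, beta, lambda u: 0.01)
```

For a constant baseline hazard the integral has a closed form, so the numerical result can be checked against 1 - exp(-0.01 * 10 * exp(0.5)).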

12. (original) The method of claim 1, wherein said collecting comprises imputing 
missing data to said plurality of data sets. 

13. (original) The method of claim 1, wherein each corresponding weight is weighted 
by an adjustment factor indicative of the representativeness of the member 
associated with said each corresponding weight. 

14. (original) The method of claim 13, wherein an adjustment factor $a_i$ for a data set 
obtained from a member i of said population is calculated as: 
where n > is the number of members in said population who share a same set of 
characteristics with said member i, and nf is the number of members associated 
with said collected data who share said set of characteristics. 

15. (original) The method of claim 14, wherein said set of characteristics comprises 
non-genetic factors. 

16. (original) The method of claim 14, wherein said set of characteristics comprises 
genetic factors. 

17. (original) The method of claim 14, wherein said set of characteristics comprises 
both genetic and non-genetic factors. 

18. (original) The method of claim 14, wherein said set of characteristics are selected 
from the group of age, gender, race, body mass index, smoking status, 
hypertension, cholesterol level, personal health history, and family health history. 

19. (original) The method of claim 1, comprising calculating a disease risk for said 
member of said population with said disease risk prediction model. 

20. (original) A computing system adapted to perform the method of any one of 
claims 1 to 19. 

21. (original) An article of manufacture comprising 

a computer readable medium embedded thereon computer executable instructions, 
which when executed by a computer causes said computer to determine a 
statistical model for predicting disease risk for a member of a population by 

a. collecting a plurality of sets of data, each of said sets of data associated with 
one member of said population, and comprising data of a first type, data of a 
second type, and an indicator of disease status of said one member associated 
with said set; 
b. selecting a candidate statistical model for calculating said disease risk as a 
function of data of said first type, said candidate model dependent on a 
plurality of parameters; 

c. determining a plurality of weights, each one of said weights associated with 
one of said sets of data and indicating a statistical significance of said one of 
said sets of data, wherein weights associated with sets of said data having like 
data of said second type are the same; and 

d. optimizing said parameters of said candidate model by fitting said plurality of 
sets of data to said candidate model, taking into account said weights. 

22. (currently amended) [[A method of imputing missing data indicative of a plurality 
of factors, comprising]] The method of claim 1, wherein each of said sets of data is 
indicative of a plurality of factors, and said collecting comprises: 

a. determining a correlation between said plurality of factors; 

b. grouping said factors into batches such that all factors in each said batch are 
correlated; and 

c. imputing missing data for factors in one said batch at a time. 

23. (currently amended) [[A method of grouping a plurality of data sets into groups 
comprising]] The method of claim 4, wherein said grouping comprises: 

a. dividing said plurality of [[data]] sets of data into two or more groups depending 
on data indicative of a factor of [[a]] said first type in each of said data sets; 

b. determining if a criterion is met after said dividing, said criterion is evaluated 
based on data of [[a]] said second type in each of said data sets; and 

c. when said criterion is not met, regrouping said plurality of [[data]] sets of data 
back into one group. 

24. (original) The method of claim 23, wherein said dividing is performed recursively 
on each group of a division. 
25. (original) The method of claim 24, wherein divisions at different levels are made 
dependent on data indicative of different factors. 

26. (original) The method of claim 25, wherein a branch of said recursive division is 
terminated at the level at which said criterion is not met. 
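
The recursive division of claims 23 to 26 might be sketched as follows, for illustration only. The criterion shown (a minimum gap between subgroup carrier fractions) is a hypothetical stand-in, since the claims do not specify the criterion; the field names `sex`, `smoker`, and `carrier` and the threshold are likewise assumptions.

```python
def carrier_fraction(group):
    """Summary of second-type data in a group (hypothetical: fraction of carriers)."""
    return sum(r["carrier"] for r in group) / len(group)

def criterion_met(parts, min_gap=0.2):
    """Hypothetical criterion evaluated on second-type data of the divided groups."""
    fractions = [carrier_fraction(p) for p in parts]
    return max(fractions) - min(fractions) >= min_gap

def divide(group, factors):
    """Divide on the first factor, recurse on each part with the remaining factors."""
    if not factors:
        return [group]
    factor, rest = factors[0], factors[1:]
    values = sorted({r[factor] for r in group})
    parts = [[r for r in group if r[factor] == v] for v in values]
    if len(parts) < 2 or not criterion_met(parts):
        return [group]          # regroup into one group; this branch terminates here
    result = []
    for part in parts:
        result.extend(divide(part, rest))
    return result

records = [
    {"sex": "F", "smoker": 0, "carrier": 1},
    {"sex": "F", "smoker": 0, "carrier": 0},
    {"sex": "F", "smoker": 1, "carrier": 1},
    {"sex": "F", "smoker": 1, "carrier": 0},
    {"sex": "M", "smoker": 0, "carrier": 0},
    {"sex": "M", "smoker": 0, "carrier": 0},
    {"sex": "M", "smoker": 1, "carrier": 1},
    {"sex": "M", "smoker": 1, "carrier": 0},
]
groups = divide(records, ["sex", "smoker"])
group_sizes = [len(g) for g in groups]
```

In this example the division by `sex` is kept, the further division of the "F" branch by `smoker` is undone because the criterion is not met, and the "M" branch is divided once more.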

27. (original) A method of weighing a plurality of data sets, each one of said data sets 
associated with a member of a population, comprising: 

weighing each set of said plurality of data sets by a weight indicative of the 
representativeness of the member associated with said each set, wherein a 
weight $a_i$ for a data set obtained from a member i of said population is 
calculated as: 



where „f is the number of members in said population who share a same set 
of characteristics with said member i, and nf is the number of members 
associated with said collected data who share said set of characteristics. 